06 - Project 1

🧬 Project: Neural Classification of Erythrocyte Anomalies

1. Project Overview

In low-resource hematology settings, manual screening of blood smears for intracellular parasites is time-consuming and error-prone. This project aims to automate the triage process by developing a deep learning model that distinguishes healthy erythrocytes (red blood cells) from those containing a specific intracellular pathogen.

Your task is to design, train, and validate a Convolutional Neural Network (CNN) to perform binary classification on single-cell images.

2. The Dataset

Dataset download link: Dataset

You are provided with a proprietary dataset consisting of segmented “patches” (Regions of Interest) extracted from thin blood smear slides stained with Giemsa. Each image contains a single cell.

The data has been anonymized and split into two distinct sets:

⚠️ Data Note: The images vary in resolution and aspect ratio. A crucial part of your pipeline is a robust pre-processing strategy that brings these inputs to a common size and intensity range before they are fed into your network.

3. Technical Objectives

A. Data Pre-processing & Augmentation

Since the input dimensions vary, you must implement a pipeline to:

  1. Resize/Rescale images to a fixed input size (e.g., \(64\times64\), \(128\times128\), or \(224\times224\)) suitable for your architecture.
  2. Normalize pixel intensity values.
  3. Implement Data Augmentation on the training set to reduce overfitting. Consider rotations, flips, and brightness adjustments to simulate varying lighting conditions in microscopy.
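
The three steps above can be sketched as follows. This is a minimal illustration in plain NumPy, not a required implementation: the function names (`resize_with_padding`, `normalize`, `augment`), the \(64\times64\) target size, the zero (black) padding, the 8-bit RGB input, and the ±20% brightness range are all assumptions chosen for the example. Rotations are limited to multiples of 90° so that no interpolation is needed. In practice you would more likely use the transforms provided by your deep learning framework.

```python
import numpy as np


def resize_with_padding(img, size=64):
    """Nearest-neighbour resize that keeps the aspect ratio, then pads to size x size.

    Assumes img is an H x W x C uint8 array. Padding with zeros is an
    assumption; pick a fill value that matches your image background.
    """
    h, w = img.shape[:2]
    scale = size / max(h, w)
    new_h = max(1, int(round(h * scale)))
    new_w = max(1, int(round(w * scale)))
    # Map each output row/column back to its nearest source index.
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    resized = img[rows][:, cols]
    # Centre the resized cell on a square canvas.
    canvas = np.zeros((size, size, img.shape[2]), dtype=img.dtype)
    top = (size - new_h) // 2
    left = (size - new_w) // 2
    canvas[top:top + new_h, left:left + new_w] = resized
    return canvas


def normalize(img):
    """Scale 8-bit pixel intensities to float32 values in [0, 1]."""
    return img.astype(np.float32) / 255.0


def augment(img, rng, max_brightness_delta=0.2):
    """Random flips, a random 90-degree rotation, and a brightness change.

    Expects a square float image in [0, 1]. Apply to training images only.
    """
    if rng.random() < 0.5:
        img = img[:, ::-1]  # horizontal flip
    if rng.random() < 0.5:
        img = img[::-1]  # vertical flip
    img = np.rot90(img, k=int(rng.integers(0, 4)))  # 0, 90, 180 or 270 degrees
    factor = 1.0 + rng.uniform(-max_brightness_delta, max_brightness_delta)
    return np.clip(img * factor, 0.0, 1.0).astype(np.float32)


# Usage on a synthetic 90 x 130 RGB "patch" standing in for a real cell image.
rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(90, 130, 3), dtype=np.uint8)
x = normalize(resize_with_padding(raw, size=64))
x_aug = augment(x, rng)
print(x.shape, x_aug.shape)
```

Padding before resizing to a square avoids distorting the cell shape, which may itself carry diagnostic information. Stretching every image directly to the target size is a simpler alternative that is worth comparing.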

B. Neural Network Architecture

You are required to construct a Convolutional Neural Network. You may choose one of two paths:

C. Training Loop

4. Deliverables

Part 1: Short report

Your short report should be a PDF document containing the following information:

Part 2: The “Blind” Test Submission

You must run your final, trained model on the images in the test folder.